SUMMARY


Background 

Measures of cardiorespiratory fitness are routinely included in
physical  fitness  tests  (PFTs)  that  are  administered  for  personnel 
selection  or  to  monitor  the  fitness  of  a  population.  Typically, 
the  cardiorespiratory  measures  take  the  form  of  a  run  test.  Walk 
tests  may  be  a  viable  alternative  to  run  tests.  However,  much  of 
the  literature  on  walk  tests  is  recent.  To  date,  walk  test 
validity  has  not  been  directly  compared  with  run  test  validity. 

Objective 

This  report  provides  a  quantitative  summary  of  the  validity  of 
walk  tests  and  compares  walk  test  validity  with  run  test 
validity. 

Approach 

The  published  literature  was  reviewed  to  identify  studies  that 
related  walk  test  performance  to  laboratory  measures  of  maximal 
oxygen uptake capacity (VO2max). Meta-analysis techniques were
used to average the reported correlation coefficients and compare
them with the average values of the same statistics for run
tests.

Findings 

The literature search produced 39 studies, 37 of which concerned
1-km, 2-km, 1-mile, 6-min, or 12-min walk tests. Walk test
performance was significantly (p < 10⁻⁶) related to VO2max for each
of those tests. The relationships were near the lower boundary
(i.e., r = .60) for acceptable validity. Each walk test was less
valid than its comparable run test. However, combining walk test
performance with age, weight, gender, and exercise heart rate
produced regression equations that predicted VO2max as well as run
tests. Standard errors of estimate were 5.01 ml·kg⁻¹·min⁻¹ for the
walk test for men and 3.78 ml·kg⁻¹·min⁻¹ for women. The comparable
run test values were 4.69 ml·kg⁻¹·min⁻¹ and 3.38 ml·kg⁻¹·min⁻¹,
respectively.

Conclusions 

Walk  tests  are  valid  indicators  of  maximal  aerobic  capacity. 
However,  walk  test  performance  must  be  combined  with  information 
on age, weight, gender, and exercise heart rate to produce VO2max
estimates  that  are  as  good  as  the  run  tests  currently  used  in 
PFTs.  The  multivariate  approach  would  be  recommended  when  using 
walk  tests. 
Introduction


Running performance is commonly used to assess aerobic
fitness in military physical fitness tests (PFTs). A substantial
body of evidence relating running performance to measured maximal
oxygen uptake capacity (VO2max) supports this practice. Walk tests
are an alternative method of estimating aerobic fitness that may
be preferable in some situations. Solway, Brooks, et al. (2001)
provided a qualitative review of the evidence supporting the
claim that walk tests are valid indicators of VO2max. This review
provides a quantitative summary of that evidence and a comparison
of walk tests and run tests.

This  report  focuses  on  walk  test  validity.  In  everyday 
conversation,  the  word  "valid"  conveys  the  idea  that  an  assertion 
is  "true,"  or  "correct."  Valid  has  a  narrower  technical 
definition  when  used  in  connection  with  testing  standards.  In 
this  context,  "validity"  refers  to  the  appropriateness  of  some 
interpretation  of  a  set  of  test  scores  (American  Psychological 
Association,  1985).  Test  validation  is  the  process  of  gathering 
empirical evidence to support the proposed interpretation(s) of
the scores.

Good testing practice requires that the validity of walk
tests be demonstrated empirically. Evidence that walk test
performance is reliably related to laboratory VO2max measures is a
critical requirement for establishing the validity of walk tests.
This evidence is critical because laboratory measurements of
oxygen uptake during treadmill runs or bicycle ergometer rides
are accepted as the best available methods of assessing aerobic
fitness. Walk tests would not be plausible indicators of aerobic
fitness if performance were not related to this accepted
standard. Therefore, this report uses meta-analytic procedures
(cf. Cooper & Hedges, 1994; Hedges & Olkin, 1985) to summarize
the available evidence bearing on the claim that walk tests meet
this basic validity requirement.

Walk  Test  Validity  Evidence 

Any review begins with a search for relevant studies. For
the present purposes, a relevant study was one that reported an
empirical estimate of the association between walk test
performance and VO2max. An initial list of relevant studies was
constructed from the Solway, Brooks, et al. (2001) reference
list. This list was extended by searching the PubMed® database
using "walk test" as the search term. The abstracts for the
articles identified in this search were examined to determine
whether VO2max had been measured. If so, the study was added to
the list.

Copies of the articles in the original list were obtained.
The articles were read to determine which ones reported the
required correlations. When a correlation was reported, the paper
was read to identify any references to prior studies of the
performance-VO2max relationship. Citations not previously
identified were added to the list of studies to be reviewed.

The  list  of  relevant  articles  was  completed  by  a  further 
search  of  the  PubMed  database.  PubMed  includes  a  "related 
articles"  function.  Once  an  article  of  interest  has  been 
identified,  clicking  a  button  generates  a  list  of  other  articles 
dealing  with  similar  subject  matter.  This  function  was  used  for 
each  study  identified  in  the  PubMed  search.  If  the  abstract  of  a 
related  article  suggested  that  a  relevant  correlation  might  be 
reported,  the  article  was  examined  to  determine  whether  it 
provided  evidence  that  should  be  added  to  the  database  for  this 
review. 

The search identified 39 studies that reported at least 1
correlation between walk test performance and VO2max (Appendix A).
The cumulative sample size was 1,927 participants. The samples
were not representative of the general population. Most (n =
1,117, 58.0%) participants were patients with moderate to severe
cardiac or respiratory disease. The average age of the
participants ranged from 7 years to 68 years, but most data were
from samples near the extremes of this range (<15 years, n = 221,
11.5%; >50 years, n = 995, 51.6%). Adult samples with average
ages between 36 and 50 (n = 628, 32.6%) accounted for most of the
remaining data. Only about 1 of every 25 (n = 83, 4.3%)
participants was from a sample of young adults. Because patient
populations tended to be older, the typical study participant was
a patient over the age of 50.

Table  1  presents  the  basic  validity  evidence.  Table  2 
summarizes  that  evidence  on  a  test-by-test  basis.  The  cumulative 
evidence leaves no doubt that walk tests are related to VO2max.
Major  observations  were: 

A. The average validity coefficient was highly significant (p
< 10⁻⁶)¹ for each test that has been studied in more than
one sample.

B. The average validity coefficients differed significantly
between tests (χ² = 13.29, 4 df, p < .010).²

C. The run test was more valid than the walk test for the 1-km
(z = 3.33, p < .001), 1-mile (z = 3.88, p < .001), and 2-km
(z = 3.60, p < .001) distances. The run and walk were
equivalent for the 12-min test (z = 0.80, p > .289). The


¹ Determined by the method of adding Zs (Rosenthal, 1978).
² Determined by Hedges' Q (Hedges & Olkin, 1985).
Table 1. Basic Validity Findings

                        Sample   Validity
Study           Year    Size     Coefficient    z         SEE

6-min test
Roul            1998    121      .24            2.66*     4.37
Lipkin          1986     10      .34             .94      2.72
Montgomery      1998     64      .37            2.99*     2.89
Lipkin          1986     10      .54            1.60      2.93
Lipkin          1986     16      .55            2.21*     1.27
Lucas           1999    264      .57           10.46*     4.11
Opasich         2001    311      .59           11.89*     3.55
Faggiano        1997     26      .63            3.56*     3.11
Cahalin         1996     45      .64            4.91*     3.07
Cahalin         1995     30      .67            4.21*     2.75
Zugck           2000    113      .68            8.70*     3.96
Nixon           1996     17      .70            3.25*     3.64
Cahalin         1995     30      .73            4.83*     2.80
Riley           1992     11      .88            3.89*     --

12-min test
Bernstein       1994      9      .65            1.90*     --
Nakagaichi      1998     25      .73            4.36*     6.42
Nakagaichi      1998     17      .78            3.91*     4.32

1-km test
Laukkanen       1992     32      .47            2.75*     4.41
Laukkanen       1992     45      .63            4.80*     3.11

1-mi test
Cureton         1997     92      .27            2.61*     5.39
McCormack       1991     17      .34            1.32      4.33
Jackson         1994     20      .37            1.60      7.34
Cureton         1997     53      .38            2.83*     5.36
McCormack       1991     27      .49            2.63*     3.89
Jackson         1994     21      .55            2.62*    10.27
Draheim         1999     23      .73            4.15*     7.24
Rintala         1992     19      .81            4.51*     5.86
McCormack       1991     15      .82            4.01*     4.96

2-km test
Laukkanen       1993     44      .31            2.05*     7.32
Laukkanen       1992     32      .49            2.89*     4.36
Laukkanen       1993     32      .52            3.10*     5.12
Oja             1991     35      .58            3.75*     8.06
Laukkanen       1989     79      .61            6.18*     7.53
Laukkanen       1992     45      .72            5.88*     2.78
Laukkanen       1993     35      .73            5.25*     4.78
Oja             1991     29      .74            4.85*     4.51
Laukkanen       1989     80      .75            8.54*     6.28

Miscellaneous tests
Mercer          1998     14      .83            3.94*     1.80
Singh           1994     19      .88            5.50*     1.95

Note. "Study" = senior author. "SEE" = standard error of estimate.
-- = missing data.
* p < .05, one-tailed.
Table 2. Summary of Walk Test Validity Results

                          Walk              Run
Test       #(a)   n(b)    r(c)    Znull(d)  r(e)    Diff(f)   Zdiff(g)   Sig(h)

Fixed-Time
6 min      14     1068    .564    17.66     .481    -.083     -3.05      <.004
12 min      3       51    .738     5.87     .789     .051       .80      >.289

Fixed-Distance
1 km        2       77    .570     5.34     .779     .209      3.33      <.001
1 mile      9      287    .464     8.76     .631     .167      3.88      <.001
2 km        9      411    .635    14.16     .737     .102      3.60      <.001

(a) Number of samples.
(b) Cumulative sample size.
(c) Weighted average correlation using Fisher's r-to-z transformation;
weights were (n - 3) where n was the sample size.
(d) Test of ρ = 0 by the method of adding zs (Rosenthal, 1978).
(e) Weighted average correlation for run tests from Vickers (2001a, 2001b).
(f) Difference = (average for run - average for walk).
(g) z-value for run-walk difference with the run average treated as a fixed
value (Hays, 1963, pp. 528-532).
(h) Significance of run-walk difference.


walk test was superior for the 6-min test (z = -3.05, p <
.004).

D. Longer walks tended to be more valid than shorter tests.

For fixed-time tests, the 12-min walk (r = .738) was
significantly (z_diff = 1.95, p < .026, one-tailed) more
valid than the 6-min walk (r = .564). The picture was less
certain for fixed-distance tests. If the tests were ordered
perfectly by length, the validity of the 1-mile walk would
have fallen between the 1-km and 2-km tests. Instead the
1-mile walk had the lowest average validity (r = .464). The
2-km walk (r = .635) was more valid than the 1-km walk (r =
.570). However, age contributed to this confusion.⁵ When
only adult samples were considered, the 1-mile walk


³ z-scores, including the differences between tests, were computed using
Fisher's r-to-z transformation (Hays, 1963, pp. 528-532).

⁴ z-scores, including the differences between tests, were computed using
Fisher's r-to-z transformation (Hays, 1963, pp. 528-532).

⁵ Appendix A lists the studies from lowest to highest validity
coefficients. Younger samples clearly tended to be listed first for the
1-mile walk test. In fact, validity was significantly lower (z = -2.60,
p < .014) for children (r = .383) than for adults (r = .642). Data from
Vickers (2001a, 2001b) showed a similar trend (under 16 years, r =
.575; over 16 years, r = .677) in 24 studies of the 1-mile run. The
1-mile walk was less valid than the 1-mile run for children (run, r =
.575; walk, r = .383; z_diff = 3.60, p < .0007, one-tailed), but not for
adults (run, r = .677; walk, r = .642; z_diff = 0.49, p > .353,
one-tailed).
validity (r = .642) was slightly higher than the 2-km walk
validity. The weak general tendency toward higher validity
for longer walks was statistically significant (χ² = 6.01,
1 df, p < .015) when the 6-min and 1-km tests were combined
and contrasted with the combined 12-min, 1-mile, and 2-km
tests.

E. If r = .60 is a minimum standard for validity (Nunnally &
Bernstein, 1994), the 12-min, 1-mile, and 2-km walks were
acceptable  tests.  The  6-min  and  1-km  walks  were  below  this 
criterion. 
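
The averages and tests behind these observations can be sketched in a few lines. This is an illustration only, not the report's own computation: the function names are hypothetical, and the formulas are the textbook forms of the procedures named in the Table 2 notes (Fisher r-to-z averaging weighted by n - 3, the method of adding zs, and a z test against a fixed value). Applied to the three 12-min samples in Table 1, this reading gives the 12-min row of Table 2; other rows may rest on details the text does not state.

```python
import math

# (sample size, validity coefficient, z) for the 12-min walk samples in Table 1
TWELVE_MIN = [(9, 0.65, 1.90), (25, 0.73, 4.36), (17, 0.78, 3.91)]


def fisher_weighted_mean(samples):
    """Average correlations in Fisher z units, weighting each sample by n - 3.

    Returns the back-transformed average r and the total weight.
    """
    weights = [n - 3 for n, _ in samples]
    z_bar = sum(w * math.atanh(r) for w, (_, r) in zip(weights, samples)) / sum(weights)
    return math.tanh(z_bar), sum(weights)


def adding_zs(z_values):
    """Method of adding zs: the sum of the z scores divided by sqrt(k)."""
    return sum(z_values) / math.sqrt(len(z_values))


def z_against_fixed(r_walk, total_weight, r_fixed):
    """z for (fixed value - walk average), treating r_fixed as a constant."""
    return (math.atanh(r_fixed) - math.atanh(r_walk)) * math.sqrt(total_weight)


r_walk, weight = fisher_weighted_mean([(n, r) for n, r, _ in TWELVE_MIN])
z_null = adding_zs([z for _, _, z in TWELVE_MIN])
z_diff = z_against_fixed(r_walk, weight, 0.789)  # 0.789 = run average in Table 2

print(f"walk r = {r_walk:.3f}, z_null = {z_null:.2f}, z_diff = {z_diff:.2f}")
# prints: walk r = 0.738, z_null = 5.87, z_diff = 0.80
```

Working in Fisher z units keeps the weights proportional to the inverse sampling variance, 1/(n - 3), which is why the larger samples dominate the average.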

Discussion 

Walk tests are valid, but only the 12-min, 1-mile, and 2-km
tests met minimum validity standards. Even the average validity
of  those  tests  was  only  borderline  acceptable.  The  12-min  walk 
test  may  be  an  exception  to  this  generalization,  but  there  is  too 
little  evidence  available  at  this  time  to  place  much  confidence 
in  the  higher  average  validity  for  that  test.  Note  should  also  be 
taken  of  the  fact  that  the  average  validity  for  the  6-min  and  1- 
km  tests  was  just  below  the  minimum  validity  standard.  A  few 
additional  studies  with  higher  validities  could  change  the 
inferences  for  those  tests  as  well.  Thus,  it  should  be  remembered 
that  the  validity  difference  between  the  shorter  and  longer  tests 
achieved  statistical  significance  only  when  the  tests  were 
grouped  and  analysis  was  limited  to  adult  samples.  The  overall 
data  trends  were  too  weak  to  conclude  that  there  is  a  sound 
empirical  basis  for  choosing  among  the  walk  tests  at  this  time. 
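
The between-test comparison can be illustrated with a homogeneity statistic computed from the Table 2 averages. The sketch below assumes that each test's weight is the sum of (n - 3) over its samples and that the statistic is the usual weighted sum of squared deviations in Fisher z units. Both are assumptions about how the reported value was obtained, and the names are hypothetical. Under them the result is about 13.3 on 4 df, in line with the reported χ² = 13.29.

```python
import math

# Table 2: test -> (number of samples, cumulative n, weighted average walk r)
TABLE_2 = {
    "6 min": (14, 1068, 0.564),
    "12 min": (3, 51, 0.738),
    "1 km": (2, 77, 0.570),
    "1 mile": (9, 287, 0.464),
    "2 km": (9, 411, 0.635),
}


def homogeneity_q(groups):
    """Weighted sum of squared deviations of Fisher z values from their mean."""
    # Assumed weight for each test: sum of (n - 3) over its k samples = n - 3k.
    weighted = [(n - 3 * k, math.atanh(r)) for k, n, r in groups]
    total = sum(w for w, _ in weighted)
    z_bar = sum(w * z for w, z in weighted) / total
    return sum(w * (z - z_bar) ** 2 for w, z in weighted)


def chi2_upper_tail_4df(x):
    """Upper-tail probability of a chi-square variate with 4 df (closed form)."""
    return math.exp(-x / 2) * (1 + x / 2)


q = homogeneity_q(TABLE_2.values())
p = chi2_upper_tail_4df(q)
print(f"Q = {q:.2f} on {len(TABLE_2) - 1} df, p = {p:.4f}")
```

The closed-form tail probability holds only for 4 df; a general implementation would use a chi-square routine from a statistics library.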

The inference that longer tests are more valid than shorter
tests should be viewed with caution, but not discounted
altogether. This suggestion should be viewed with caution because
it was reached in several steps. The results, therefore, might be
viewed with skepticism because they involved excessive data
manipulation. However, the suggestion that the trend is probably
real rests partly on evidence not covered in this review. The
validity of fixed-distance run tests increases with the logarithm
of distance up to 2 km (Vickers, 2001a, 2001b). If walk tests are
analogous to run tests, the fact that validity increases with the
logarithm of distance implies that the tests examined here will
show only small differences. A weak trend in the expected
direction, therefore, may be all that could be expected.

Multivariate  Walk  Test  Equations 

Multivariate walk test equations combine walk time (t_w) with
other information to improve the precision of VO2max estimates.
This section examines two multivariate equations, the Rockport
Fitness Walk Test (RFWT; Kline, Porcari, et al., 1987) and the
Urho Kaleva Kekkonen Institute Walk Test (UKKWT; Oja, Laukkanen,
et al., 1991). Other multivariate walk tests have been developed
(Dolgener, Hensley, et al., 1994; George, Fellingham, et al.,
1998), but those tests are not covered here. The sample of
participants in the Dolgener, Hensley, et al. (1994) study
appears to be atypical. As a result, their equations do not
perform well in new samples (Appendix B). The George, Fellingham,
et al. (1998) equations have not been studied enough to reach
firm conclusions about their value at this time.

Rockport  Fitness  Walking  Test 

The RFWT consists of 3 equations developed by Kline,
Porcari, et al. (1987). The equations predict VO2max based on the
time required to complete a 1-mile walk, heart rate (HR) at the
end of the walk, age, weight, and gender.

RFWT Equations. The RFWT equations were developed with data
from 88% of 390 volunteers who underwent VO2max testing. The other
12% (n = 47) failed to meet established criteria for a valid
VO2max test. The participants were divided into two groups. Data
from one group (n = 174; 92 females, 82 males) were used to
develop the equations. Data from the other group (n = 169; 86
females, 83 males) were used to cross-validate the equations.

The research design restricted the sample to people between
30 and 69 years of age. Average ages were 46.5 years for males
and 48.5 years for females. Average VO2max was 42.2 (SD = 9.8)
ml·kg⁻¹·min⁻¹ for men and 31.4 (SD = 8.5) ml·kg⁻¹·min⁻¹ for
women. VO2max values for study participants were close to what would be
expected given the ages of the samples (Fitzgerald, Tanaka, et
al., 1997; Wilson & Tanaka, 2000).

Laboratory treadmill measurements of VO2max were the
dependent variable in the RFWT equations. Participants ran on the
treadmill at a self-selected pace. The test began with the
treadmill at 0% grade. The grade was increased 2.5% every 2 min.
Participants were encouraged verbally. The test stopped when the
individual was unable to continue despite the encouragement. The
criteria for determining that a true maximal oxygen uptake had
been achieved during the test were (a) VO2 leveled off during the
test despite an increase in work, (b) the respiratory exchange
ratio (RER) reached or exceeded 1.10, and (c) the exercise HR was
less than 15 beats per minute below age-predicted maximal HR.
Measured VO2 uptake was accepted as a valid VO2max when at least 2
of the 3 criteria were met.
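
The 2-of-3 acceptance rule can be written as a small check. This is a sketch with hypothetical names. The plateau judgment is passed in as a flag because the text does not say how leveling off was quantified, and the age-predicted maximal HR is passed in as a number because the text does not give the prediction formula.

```python
def is_valid_vo2max(plateau_reached, rer, exercise_hr, age_predicted_max_hr):
    """Apply the 2-of-3 rule described for the RFWT treadmill test.

    plateau_reached: True if VO2 leveled off despite an increase in work.
    rer: peak respiratory exchange ratio.
    exercise_hr, age_predicted_max_hr: heart rates in beats per minute.
    """
    criteria = [
        bool(plateau_reached),                    # (a) VO2 plateau
        rer >= 1.10,                              # (b) RER reached or exceeded 1.10
        (age_predicted_max_hr - exercise_hr) < 15,  # (c) HR less than 15 bpm below maximum
    ]
    return sum(criteria) >= 2


# Hypothetical participant: no plateau, RER of 1.12, HR 10 bpm below predicted maximum
print(is_valid_vo2max(False, 1.12, 170, 180))  # True: criteria (b) and (c) are met
```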

Fourteen (14) potential predictors were considered. During
each test, participants walked 1 mile as fast as they could.
Heart rate was monitored and recorded during the walks. Walk time
for the mile and 4 HRs were recorded. The heart rates were the
average values for the last 1 min of each one quarter mile of the
Table 3. Rockport Fitness Walk Test Equations


Generalized Equation:

VO2max = 132.853 - (0.0769 × weight) - (0.3877 × age) + (6.315 × sex)
         - (3.2649 × time) - (0.1565 × HR)

Gender-Specific Male Equation:

VO2max = 154.889 - (0.0947 × weight) - (0.3709 × age) - (3.9744 × time)
         - (0.1847 × HR)

Gender-Specific Female Equation:

VO2max = 116.579 - (0.0585 × weight) - (0.3885 × age) - (2.7961 × time)
         - (0.1109 × HR)

Note. These equations are taken from Kline, Porcari, et al. (1987).
Weight was measured in pounds, age in years, and time in minutes. Sex
was coded 0 for females and 1 for males. Heart rate was measured during
the last 1 min of the first 1-mile walk.


walk. Each participant completed the walk test at least twice. If
the times (t_w values) were within 30 s of each other, the two walks
were accepted as providing acceptable performance measures. If
the t_w values for the first two tests differed by more than 30 s, "...
subsequent walks were performed until this criterion was met"
(Kline, Porcari, et al., 1987, p. 255). The 14 potential
predictors included the 10 walk test measurements plus age,
weight, height, and gender. The "best subsets" regression
procedure from the BMDP computer package (Dixon, Brown, et al.,
1990) was used to establish the final regression equations.

Kline et al. (1987) developed 2 predictive models (Table
3). The first model consisted of a single regression equation for
men and women (Generalized Equation). Gender was a predictor in
this model. The second model had separate equations for men and
women (Gender-Specific Equations). The multiple correlations were
high (Generalized, R = .88; males, R = .85; females, R = .86).
The standard error of estimate (SEE) was 5.0 for the Generalized
Equation, 5.3 for the male equation, and 4.5 for the female
equation. If prediction errors were random and normally
distributed, true VO2max would have a 95% probability of being
within ±2 SEE of the predicted value.
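
As a worked illustration, the Table 3 equations can be coded directly. The coefficients and SEE values are the ones reported in the text. The function names and the example participant are hypothetical, and the interval is simply the ±2 SEE band described above.

```python
def rfwt_generalized(weight_lb, age_yr, sex, time_min, hr_bpm):
    """Generalized Equation from Table 3 (sex: 0 = female, 1 = male)."""
    return (132.853 - 0.0769 * weight_lb - 0.3877 * age_yr + 6.315 * sex
            - 3.2649 * time_min - 0.1565 * hr_bpm)


def rfwt_male(weight_lb, age_yr, time_min, hr_bpm):
    """Gender-Specific Male Equation from Table 3."""
    return (154.889 - 0.0947 * weight_lb - 0.3709 * age_yr
            - 3.9744 * time_min - 0.1847 * hr_bpm)


def rfwt_female(weight_lb, age_yr, time_min, hr_bpm):
    """Gender-Specific Female Equation from Table 3."""
    return (116.579 - 0.0585 * weight_lb - 0.3885 * age_yr
            - 2.7961 * time_min - 0.1109 * hr_bpm)


def two_see_band(estimate, see):
    """Approximate 95% band: predicted value plus or minus 2 SEE."""
    return estimate - 2 * see, estimate + 2 * see


# Hypothetical participant: 40-year-old man, 170 lb, 15.0-min mile, HR 140 bpm
estimate = rfwt_generalized(170, 40, 1, 15.0, 140)
low, high = two_see_band(estimate, 5.0)  # SEE = 5.0 for the Generalized Equation
print(f"VO2max = {estimate:.1f} ml/kg/min (band {low:.1f} to {high:.1f})")
# prints: VO2max = 39.7 ml/kg/min (band 29.7 to 49.7)
```

The width of the band, about 20 ml·kg⁻¹·min⁻¹ for a single individual, is a reminder that a high multiple correlation does not by itself guarantee a precise individual estimate.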

RFWT Cross-Validations. The RFWT equations have been
extensively cross-validated (Table 4).⁶ Each cross-validation


⁶ This review covers 10 of 12 studies. Dolgener, Hensley, et al. (1994)
were dropped as an outlier (see Appendix B). Ward, Wilkie, et al.
(1987) were excluded because they did not indicate which equation(s)
Table 4. Cross-Validation of the RFWT Equations

Study           Year    Gender    n      r      SEE     Bias

Generalized Equation
Draheim         1999    C          23    .74    7.10     4.75
Coleman         1987    C          90    .79    5.62     0.10
Kittredge       1994    C          25    .81    4.22    10.00
George          1998    C          98    .84    3.58     5.00
Kline           1987    C         169    .88    4.94    -0.10
O'Hanley        1987    C          29    .88    2.71    -5.30
Widrick         1992    C         145    .91    5.10    -0.60
Coleman         1987    F          50    .62    5.49     1.40
George          1998    F          59    .71    2.96     6.00
Fenstermaker    1992    F          16    .78    2.07    -0.15
O'Hanley        1987    F          19    .84    2.28    -6.80
Widrick         1992    F          75    .86    4.34     1.40
Stanforth       1999    F          36    .89    4.10     0.60
Coleman         1987    M          40    .79    5.70    -1.50
George          1998    M          39    .79    3.86     3.60
O'Hanley        1987    M          10    .81    3.17    -2.50
Widrick         1992    M          70    .88    5.18    -2.80
Stanforth       1999    M          31    .89    5.29    -2.20

Gender-Specific Equations
Zwiren          1991    F          38    .73    4.51     1.50
Fenstermaker    1992    F          16    .79    2.02     0.13
Widrick         1992    F          75    .85    4.48     0.70
Kline           1987    F          86    .86    3.83    -0.10
Stanforth       1999    F          36    .87    4.44     0.50
Kline           1987    M          83    .84    5.70    -0.30
Widrick         1992    M          70    .88    5.18    -2.90
Stanforth       1999    M          31    .89    5.29    -2.60

Note. Results are grouped by equation and gender (C = Combined male and
female; F = Female; M = Male). Studies are ordered within groups
from lowest to highest cross-validation coefficient. Multiple
correlations in development were R = .88 for the Generalized Equation,
R = .86 for the female Gender-Specific equation, and R = .85 for the
male Gender-Specific equation.

determined the age, gender, weight, HR, and t_w for new samples of
people. The values of these predictors were inserted into the
RFWT equations to predict each individual's VO2max. Each
cross-validation also included a laboratory measurement of VO2max.
Correlation coefficients relating the predicted and measured


they  examined.  This  omission  made  it  impossible  to  determine  where 
their  study  fit  into  the  overall  body  of  cross-validation  evidence. 
Table 5. Summary of RFWT Equation Results

                                      Cross-
                                      Validation
Equation           Gender      Σn     r             SEE     Bias

Generalized        Combined    579    .87           4.80     1.04
                   Female      255    .79           3.92     1.64
                   Male        190    .85           4.94    -1.10
Gender-Specific    Female      251    .84           4.10     0.48
                   Male        184    .86           5.43    -1.68

Note. Average values for r were computed by applying the Fisher r-to-z
transformation, weighting by (n - 3) where "n" was the sample size,
and then reversing the r-to-z transformation. Multiple correlations in
development were R = .88 for the Generalized Equation, R = .86 for the
female Gender-Specific Equation, and R = .85 for the male
Gender-Specific Equation.


VO2max values were computed. These correlations, known as cross-
validation  coefficients,  are  the  focus  of  this  section. 

Three aspects of the cross-validations are important.
First, equations developed in one sample may be weak predictors
of individual differences when applied to data from a new sample.
In this case, the average cross-validation coefficients indicated
that the predicted VO2max values were strongly related to the
observed values. The average coefficients were r = .87 for the
Generalized Equation, r = .84 for the female Gender-Specific
Equation, and r = .86 for the male Gender-Specific Equation.

Equations also may be biased when applied to data from new
samples. Bias occurs when estimated values tend to be
consistently lower or consistently higher than observed values of
the criterion. The RFWT equations were biased because the average
predicted value was 1.04 ml·kg⁻¹·min⁻¹ too high for the Generalized
Equation, 0.48 ml·kg⁻¹·min⁻¹ too high for the female
Gender-Specific Equation, and 1.68 ml·kg⁻¹·min⁻¹ too low for the male
Gender-Specific Equation. The presence of bias was not surprising
since statistical considerations associated with using a sample
to represent a population make it very likely that at least some
bias will be present in any cross-validation. The important point
in the present case, therefore, was that the biases were too
small to be of practical or theoretical importance.⁷
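
The size of these biases can be put on a common scale by dividing each by a standard deviation of VO2max. The sketch below uses the Table 5 bias values and assumes standard deviations of roughly 6, 8, and 10 ml·kg⁻¹·min⁻¹, the approximate values quoted in the accompanying footnote. Variable names are hypothetical.

```python
# Average bias (ml/kg/min) from Table 5; the sign shows the direction of the error
BIAS = {"generalized": 1.04, "female gender-specific": 0.48, "male gender-specific": -1.68}

# Approximate standard deviations of VO2max (assumed from the footnote)
SD_VALUES = (6.0, 8.0, 10.0)


def bias_effect_size(bias, sd):
    """Absolute bias expressed in standard deviation units."""
    return abs(bias) / sd


effect_sizes = sorted(
    (bias_effect_size(b, sd) for b in BIAS.values() for sd in SD_VALUES),
    reverse=True,
)
print(f"largest ES = {effect_sizes[0]:.2f}, next largest = {effect_sizes[1]:.2f}")
# prints: largest ES = 0.28, next largest = 0.21
```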


⁷ This interpretation was reached by converting the bias estimates to
effect sizes (ESs). The bias was divided by estimates of the standard
deviation of VO2max (SD_VO2max). SD_VO2max is ~6.00 ml·kg⁻¹·min⁻¹ for samples
of people who are similar in age and activity level. SD_VO2max increases
to ~8.00 ml·kg⁻¹·min⁻¹ to ~10.00 ml·kg⁻¹·min⁻¹ when wider ranges of age and
activity levels are represented (e.g., Kline, Porcari, et al., 1987).

Shrinkage is a third criterion for evaluating the cross-
validation performance of multiple regression equations.
Regression equations developed using data from one sample
typically are less accurate when the equation is applied to data
from a new sample. The difference between the original accuracy
and the accuracy in the new sample is shrinkage.⁸ The shrinkage
of the RFWT equations was trivial, amounting to .01 for the
Generalized Equation and .02 for the female Gender-Specific
Equation. For males, the average cross-validation coefficient
actually was .01 larger than the original multiple correlation.⁹
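
The shrinkage figures follow directly from the development correlations and the average cross-validation coefficients in Table 5. A minimal sketch of that arithmetic, with hypothetical variable names:

```python
# Multiple correlations in development and average cross-validation coefficients
DEVELOPMENT_R = {"generalized": 0.88, "female": 0.86, "male": 0.85}
CROSS_VALIDATION_R = {"generalized": 0.87, "female": 0.84, "male": 0.86}

# Shrinkage = development R minus cross-validation r (negative means a gain)
shrinkage = {
    name: round(DEVELOPMENT_R[name] - CROSS_VALIDATION_R[name], 2)
    for name in DEVELOPMENT_R
}
print(shrinkage)
# prints: {'generalized': 0.01, 'female': 0.02, 'male': -0.01}
```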

Table 5 also provides a basis for evaluating the utility of
the Gender-Specific Equations. Those equations would be useful if
they improved significantly on the Generalized Equation. The
cross-validation coefficients for the Generalized Equation in
single-gender samples are the proper basis of comparison for
determining the additional variance explained by the
Gender-Specific Equations. These coefficients remove gender
differences in performance and VO2max from the analysis.

The Gender-Specific Equations were slightly more accurate
than the Generalized Equation. The average cross-validation
coefficient for the Gender-Specific Equation for males was r =
.86. This figure was .01 higher than the average cross-validation
coefficient obtained when the Generalized Equation was applied to
males. The difference favoring the Gender-Specific Equation was
larger (.05) for women (r = .79 vs. r = .84), but an outlier data
point made the trend misleading. When Coleman, Wilkie, et al.
(1987) was removed from the analysis, the average cross-validation
r for the Generalized Equation increased to .83, only .01 less
than the cross-validation r for the Gender-Specific Equation.
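
The effect of the outlier can be checked from the female rows for the Generalized Equation in Table 4. The sketch below uses the weighting described in the Table 5 note (Fisher r-to-z with weights of n - 3) and a hypothetical helper name.

```python
import math

# Female samples, Generalized Equation (Table 4): (study, n, r)
FEMALE_GENERALIZED = [
    ("Coleman", 50, 0.62), ("George", 59, 0.71), ("Fenstermaker", 16, 0.78),
    ("O'Hanley", 19, 0.84), ("Widrick", 75, 0.86), ("Stanforth", 36, 0.89),
]


def weighted_mean_r(rows):
    """Fisher r-to-z average weighted by (n - 3), transformed back to r."""
    total = sum(n - 3 for _, n, _ in rows)
    z_bar = sum((n - 3) * math.atanh(r) for _, n, r in rows) / total
    return math.tanh(z_bar)


all_studies = weighted_mean_r(FEMALE_GENERALIZED)
without_coleman = weighted_mean_r(
    [row for row in FEMALE_GENERALIZED if row[0] != "Coleman"]
)
print(f"all studies: {all_studies:.2f}, without Coleman: {without_coleman:.2f}")
# prints: all studies: 0.79, without Coleman: 0.83
```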

Discussion 

The RFWT equations cross-validated well. The average cross-
validation coefficient was high (r ≥ .84), shrinkage was low (≤
.02), and bias was minor (ES ≤ 0.28). The evidence also provided
reason to prefer the Generalized Equation to the Gender-Specific


Pairing the largest bias with the smallest standard deviation yields
ES = 0.28 (i.e., 1.68/6). All other combinations yield ES ≤ 0.21.
Cohen's (1988) widely used criteria set ES ≥ 0.20 as the lower boundary
for an effect with practical or theoretical importance.
⁸ Paraphrasing Wherry (1984, p. 74), the average cross-validation
coefficient will be lower than the original multiple correlation
because the initial regression computations fit errors of measurement
as well as real trends in the original data. The adjustments to
regression coefficients to fit the unique errors of the development
sample do not apply to the measurement and sampling errors in a new
sample. Thus, less variance will be explained, and the average value of
the correlation coefficient will be lower. The lowering is shrinkage.

⁹ Greater accuracy is possible because shrinkage is an average effect,
not an inevitable occurrence.
Equations. The Gender-Specific Equations were slightly more
accurate in cross-validation, but the gain was too modest to
replace a single equation with separate gender equations.¹⁰

UKK  Walk  Test 

The  UKKWT  research  provided  an  independent  replication  of 
key  elements  of  the  RFWT  findings.  The  study  participants  were 
drawn  from  a  different  population.  The  method  of  developing  the 
predictive equations was different. The strategy used in
cross-validating the equations was different. Nevertheless, the results
reinforced  key  points  identified  for  the  RFWT.  In  addition,  the 
UKKWT  studies  reported  the  predictive  accuracy  of  sample- 
optimized  regression  equations.  This  information  provided  a 
different  frame  of  reference  for  interpreting  the  accuracy  of  the 
basic  UKKWT  equations. 

Equation Development. Subjects were recruited from
participants in a questionnaire study of health conducted in a
city in Finland (Oja, Laukkanen, et al., 1991). The study
design provided a representative sample of 20- to 65-year-old men
and women in that city. The UKKWT validation study included VO2max
tests for 10 men and 10 women selected at random from each of
four age groups (20-25 years, 35-40 years, 50-55 years, and 60-65
years). Complete VO2max and walk test data were obtained from 64
subjects, 29 women (age = 39.1 years, SD = 13.4) and 35 men (age
= 41.9 years, SD = 14.0).

VO2max was measured on a treadmill. Testing began with a
5-min walk at 0% grade. Speed was individually chosen between 4.5
and 5.5 km/hr. After 5 min, the treadmill grade was increased to
5%. The 5% grade was maintained for 2 min, after which the grade
was increased to 7.5%. Grade subsequently was increased 2.5%
every 2 min until a grade of 20% was reached. Once the 20%
grade was reached, speed was increased 0.5 km/hr every 2 min. The
measured VO2max was accepted as valid if HR was within 15% of
age-predicted maximum, RER was at least 1.0, and blood lactate was at
least 4.0 mmol/l. Average VO2max was 34.8 ml·kg⁻¹·min⁻¹ (SD = 6.7
ml·kg⁻¹·min⁻¹) for women and 43.1 ml·kg⁻¹·min⁻¹ (SD = 9.9
ml·kg⁻¹·min⁻¹) for men.
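
The three acceptance criteria can be expressed as a simple check. The sketch below is illustrative only: the function name and its arguments are hypothetical, the age-predicted maximum HR is supplied by the caller because the formula used to obtain it is not given here, and "within 15%" is read as an absolute deviation of no more than 15% of the predicted value.

```python
def vo2max_test_is_valid(peak_hr, age_predicted_max_hr, rer, lactate_mmol_per_l):
    """Return True when all three acceptance criteria are met.

    Assumptions (not stated in the report): the caller supplies the
    age-predicted maximum HR, and "within 15%" means the absolute
    difference is at most 15% of that predicted value.
    """
    hr_ok = abs(peak_hr - age_predicted_max_hr) <= 0.15 * age_predicted_max_hr
    rer_ok = rer >= 1.0                      # respiratory exchange ratio
    lactate_ok = lactate_mmol_per_l >= 4.0   # blood lactate in mmol/l
    return hr_ok and rer_ok and lactate_ok


if __name__ == "__main__":
    # Hypothetical subject: predicted maximum 180 bpm, observed peak 175 bpm.
    print(vo2max_test_is_valid(175, 180, rer=1.10, lactate_mmol_per_l=6.0))
```

A test failing any one criterion is rejected; the criteria are combined with a logical AND because the text lists them jointly.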

Walking performance was assessed on a flat 500-m stretch of
dirt road. Separate walk tests were performed over distances of
1.0, 1.5, and 2.0 km. Preliminary analyses showed that 2-km
performance had the strongest relationship to VO2max, so this


^10The principle of parsimony is the basis for preferring the generalized
equation.  Parsimony  focuses  on  the  trade-off  between  model  complexity 
and  model  explanatory/predictive  power.  Taking  the  number  of  parameters 
as  an  index  of  model  complexity  (Popper,  1959),  gender-specific 
equations  increased  complexity  67%  (from  6  to  10  parameters)  with  only 
a  1%  improvement  in  accuracy. 
distance provided the performance measures for predicting VO2max.
Average walk time (tw) was 16.9 min (SD = 1.2 min) for women and 15.2 min (SD
= 1.4 min) for men.

The UKKWT equations combined tw with age, HR, and either
height and weight or body mass index (BMI). Walk time was entered
first, followed by age, then HR. After these variables were
entered, weight and height were added to the equation as separate
predictors or as a single BMI (i.e., weight/height²) predictor.
Equations were developed separately for women and men. The model
with weight as a predictor was slightly more accurate for women.
The model with BMI was slightly more accurate for men. The BMI
equations were adopted for subsequent studies.

Cross-Validation.  The  UKKWT  research  findings  are 
summarized  in  Table  6.  The  major  inferences  from  the  data  are: 

A. The equations were accurate in the development sample.
The multiple correlations in those samples were
comparable to the values for the corresponding
gender-specific RFWT equations (males, UKKWT R = .83 vs. RFWT R
= .85; females, UKKWT R = .84 vs. RFWT R = .86).

B. Shrinkage was substantial. UKKWT cross-validation
coefficients were substantially lower than the original
Rs (men, r = .71; women, r = .69).

C. Bias was somewhat larger than for the RFWT equations.
UKKWT equations consistently underestimated VO2max, with
an average bias of -3.87 ml·kg⁻¹·min⁻¹ for men and
-1.15 ml·kg⁻¹·min⁻¹ for women.

D. The bivariate predictor-criterion relationships
underlying the equations were stable across samples. The
correlations relating VO2max to individual predictors
varied across samples, but the differences were no larger
than expected by chance (tw, χ² = 12.26, 6 df, p > .056;
age, χ² = 6.94, 6 df, p > .326; BMI, χ² = 3.52, 6 df, p >
.741; HR, χ² = 3.03, 6 df, p > .805).^11

E. The multivariate approach provided significantly
better prediction of the criterion than did the
univariate approach based on tw. Adding age, weight,
height, and HR to tw accounted for significantly more
variation in VO2max. The added value of these predictors
can be seen by comparing the correlation between tw and
VO2max with the multiple R for each sample. The F-test and
significance level given under each multiple R in Table 6
show that the increase in predictive accuracy was
statistically significant (p < .029) in 6 of 7 samples.
The  combined  trend  was  highly  significant  (p  <  10"®)  . 

^11Hedges's Q (Hedges & Olkin, 1985) was used to test for significant
differences in the correlation coefficients across samples. The Q
values were computed applying the SPSS GLM procedure (SPSS, Inc.,
Chicago, IL, 1998a, 1998b) to Fisher-transformed correlations with
(n − 3) as the weighting factor.
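
The homogeneity test described in this footnote can be sketched in a few lines. The code below is not the SPSS GLM procedure used for the report; it is a direct computation of Q from Fisher-transformed correlations weighted by (n − 3), applied to the age correlations and sample sizes listed in Table 6. The function names are hypothetical, and the chi-square tail probability uses the closed form that holds only for even degrees of freedom.

```python
import math


def hedges_q(correlations, sample_sizes):
    """Q statistic for homogeneity of correlations (Fisher z, weights n - 3)."""
    zs = [math.atanh(r) for r in correlations]      # Fisher r-to-z transform
    ws = [n - 3 for n in sample_sizes]               # inverse-variance weights
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return q, len(zs) - 1                            # Q and its degrees of freedom


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x) for even df, using the finite-series closed form."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Age-VO2max correlations and sample sizes for the seven samples in Table 6.
age_r = [-0.43, -0.51, -0.35, -0.45, 0.02, -0.39, -0.23]
n = [29, 35, 45, 32, 32, 35, 44]

q, df = hedges_q(age_r, n)
p = chi2_upper_tail_even_df(q, df)
print(f"Q = {q:.2f}, df = {df}, p = {p:.3f}")
```

With these inputs the computation agrees with the value reported for age in item D (χ² = 6.94, 6 df, p > .326), which suggests the table entries and the footnote describe the same calculation.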
Table 6. Validity of the UKKWT

                                                       Moderately     Highly
                        Development       Obese          Active       Active
                          F       M      F       M      F       M       M

N =                       29      35     45      32     32      35      44

Age (in years)
  Mean                  39.1    42.9   42.4    41.3   40.6    40.2    44.8
  SD                    13.4    14.0    8.8     8.8    4.5     4.7     5.6

VO2max (in ml·kg⁻¹·min⁻¹)
  Mean                  34.8    43.1   27.2    36.6   36.2    44.4    57.6
  SD                     6.7     9.9    4.0     5.0    6.0     7.0     7.7

Correlation of VO2max with:
  Age                   -.43    -.51   -.35    -.45    .02    -.39    -.23
  BMI                   -.58    -.51   -.35    -.60   -.34    -.48    -.54
  tw                    -.74    -.58   -.72    -.49   [illegible]  -.73    -.31
  HR                     .04     .09    .07   [illegible]   .18    -.03    -.18

Multivariate Equations
  Mult R                 .83     .84    .79     .75    .58     .83     .67
  F(a)                  3.64   12.54   3.75    6.63    .90    5.01    8.32
  p <                   .028    .001   .019    .002   .457    .007    .001
  Cross r                               .77     .75    .55     .79     .60
  F(b)                                  .66     .00    .28    1.25    1.26
  p >                                  .652    .999   .922    .311    .301
  Bias                                 -0.9    -4.3   -1.5    -3.3    -4.0
  SEE(c)                 3.3     5.1   2.55    3.31   5.01    4.29    6.16

Note. Development = Oja, Laukkanen, et al. (1991); obese samples =
Laukkanen, Oja, et al. (1992); moderately and highly active samples =
Laukkanen, Oja, et al. (1993). M = Male, F = Female. Mult R = multiple
correlation coefficient for the sample-specific equation. Cross r =
cross-validation coefficient for UKKWT equation. Bias = predicted minus
observed score. SEE = standard error of estimate.

(a) F = MSreg/MSres = [(SSreg/dfreg)/(SSres/dfres)] = [(R² − rtw²)/3]/[(1 − R²)/(n
− 5)]. F is the F-test; MS, SS, and df are the mean square, sum of
squares, and degrees of freedom, respectively. Subscripts "reg" and
"res" indicate that the statistic refers to the regression and the
residuals, respectively. R² is the squared multiple correlation
coefficient, rtw² is the squared correlation of VO2max with tw, and n is
sample size. MSreg has 3 df because the computations reflect variance
explained by age, body mass index (BMI), and heart rate (HR).

(b) F = MSreg/MSres = [(SSreg/dfreg)/(SSres/dfres)] = [(R² − cross-validation
r²)/5]/[(1 − R²)/(n − 5)], where n is the sample size.

(c) Computed from reported data as [√(1 − R²)] × SD, where SD is the sample
standard deviation.
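
The incremental F-test defined in the first table footnote can be checked against the tabled values. The following is a minimal sketch of that formula only; the function name and argument names are hypothetical, and the inputs are the Mult R, tw correlation, and N entries from Table 6.

```python
def incremental_f(mult_r, r_tw, n, df_added=3, n_params=5):
    """F for the gain from adding age, BMI, and HR to walk time.

    Implements [(R^2 - r_tw^2) / 3] / [(1 - R^2) / (n - 5)] from the
    Table 6 footnote; df_added and n_params default to the values
    given there.
    """
    r2 = mult_r ** 2
    ms_reg = (r2 - r_tw ** 2) / df_added
    ms_res = (1.0 - r2) / (n - n_params)
    return ms_reg / ms_res


# Development men: Mult R = .84, r(tw) = -.58, N = 35.
print(round(incremental_f(0.84, -0.58, 35), 2))
# Obese women: Mult R = .79, r(tw) = -.72, N = 45.
print(round(incremental_f(0.79, -0.72, 45), 2))
```

Because the inputs are rounded to two decimals, recomputed values can differ from the tabled F by about .01.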
F. The UKKWT equations were nearly optimal in each sample.
Each study reported a regression equation developed to
optimize the prediction of VO2max in that sample. These
sample-optimized equations used the same predictors as
the UKKWT equations, but selected regression weights that
produced the smallest possible prediction errors for the
sample. The multiple Rs for the sample-optimized
equations averaged .03 larger (range = .00 to .07) than
the cross-validation coefficient. The F-test and
significance levels given below the cross-validation Rs
in Table 6 show the modest size of these gains. The
improvement in predictive accuracy obtained by
substituting the sample-optimized equations for the UKKWT
equations did not approach significance in any of the 5
samples (p > .301 for each).

Discussion.  The  UKKWT  studies  underscored  the  value  of  a 
multivariate  approach.  The  predictive  utility  of  this  approach 
was clearly evident. Adding age, BMI, and exercise HR accounted
for an average of 22% more of the variance in VO2max than was
explained by tw alone. The cumulative trend was highly
significant  statistically. 

The inclusion of HR in the UKKWT equations may appear
problematic. The simple bivariate correlation between this
predictor and VO2max is close to zero. The likely explanation is
that  HR  becomes  a  significant  predictor  after  controlling  for  the 
other  variables  in  the  equations.  The  studies  did  not  report  the 
full  matrix  of  correlation  coefficients,  so  this  speculation 
could  not  be  evaluated  directly  from  the  data. 

The  evaluation  of  shrinkage  is  more  complex.  The  cross- 
validation  coefficients  were  substantially  smaller  than  the 
initial  multiple  Rs.  However,  this  trend  appears  to  derive  from 
the  choice  of  cross-validation  strategies.  The  UKKWT  equations 
were  developed  in  a  sample  drawn  from  a  general  population.  The 
cross-validation  studies  were  conducted  in  specialized  subgroups 
from within that general population. As might be expected, VO2max
was more variable in the general population than in the
subpopulations (Table 6). Other things equal, less variation in
the criterion means weaker associations with predictors.^12 As a
result, the comparison between the cross-validated equations and
the sample-optimized equations is probably a better indicator of
shrinkage. The difference in this comparison was only .03, so it
is reasonable to conclude that shrinkage was modest after
allowing for the restricted variability in VO2max.
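
The effect of restricted variability on a correlation can be illustrated with the standard correction for direct range restriction (often called Thorndike's Case II). The report does not state which correction formula, if any, was applied at this point, so the sketch below is an assumption offered only to show the direction and rough size of the effect; the function name is hypothetical.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the correlation expected in the unrestricted population.

    Uses r_c = r*u / sqrt(1 - r^2 + r^2 * u^2), where u is the ratio of
    the unrestricted to the restricted standard deviation. Assumes direct
    restriction on the variable whose standard deviations are compared.
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1.0 - r ** 2 + (r ** 2) * (u ** 2))


# Illustration with Table 6 entries: cross r = .60 in highly active men
# (SD = 7.7) versus the development sample of men (SD = 9.9).
print(round(correct_for_range_restriction(0.60, 9.9, 7.7), 2))
```

When the two standard deviations are equal the function returns r unchanged; when the unrestricted SD is larger, the estimated correlation rises, which is the pattern the paragraph above describes.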

Bias  was  somewhat  more  problematic  for  the  UKKWT  equations 
than  for  the  RFWT  equations.  The  bias  estimate  for  men  was  large 


^12This restriction-of-range effect is a well-known statistical artifact
in meta-analyses (Hunter & Schmidt, 1990).
enough to be considered a moderate effect size (Cohen, 1988). The
difference for women was too small to be important.
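
The average bias figures cited for the UKKWT (-3.87 for men, -1.15 for women) can be reproduced from the Bias and N entries in Table 6 if the average is weighted by sample size. The weighting scheme is an inference from the arithmetic rather than something the text states, and the helper name below is hypothetical.

```python
def weighted_mean(values, weights):
    """Sample-size-weighted mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Cross-validation samples in Table 6 (bias in ml/kg/min, N per sample).
men_bias, men_n = [-4.3, -3.3, -4.0], [32, 35, 44]   # obese, moderately active, highly active
women_bias, women_n = [-0.9, -1.5], [45, 32]         # obese, moderately active

print(round(weighted_mean(men_bias, men_n), 2))      # men
print(round(weighted_mean(women_bias, women_n), 2))  # women
```

An unweighted average gives nearly the same value for men but -1.20 for women, so the reported -1.15 is consistent with weighting by N.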

Two  important  generalizations  about  multivariate  walk  test 
equations  can  be  drawn  from  the  combined  RFWT  and  UKKWT  findings. 
First,  multivariate  equations  are  competitive  with  run  tests  as 
aerobic fitness indicators. The maximum validity for run tests is
r ≈ .74 for fixed-distance tests and r ≈ .82 for fixed-time tests
(Vickers, 2001a, 2001b). The multiple Rs for the multivariate
equations are above this range, but the average cross-validation
coefficients fall in this same range. Second, the multivariate
character of the equations is important. Considering age, weight,
gender, and exercise HR improves the prediction of VO2max.^13 The
only  noteworthy  problem  for  the  multivariate  equations  is  the 
possibility  that  the  predicted  values  have  enough  bias  to  limit 
their  utility.  The  evidence  for  this  problem  is  limited  to  the 
data  for  the  UKKWT  applied  to  men. 

Test  Precision 


Validity  coefficients  do  not  provide  a  complete  basis  for 
comparing  tests.  Validity  is  a  prerequisite  for  sound  testing 
practices,  but  focusing  solely  on  validity  can  be  misleading  when 
choosing  among  valid  tests.  Test  precision  should  be  considered 
as  well.  The  SEE  is  the  statistical  index  for  test  precision.  The 
SEE  formula  is 


SEE = √(1 − r²) × SD.

This formula combines test validity (i.e., r) with the sample
standard deviation of VO2max (i.e., SD).
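
A short computation shows how the formula behaves. This is a sketch with a hypothetical function name; the first example uses the Table 6 entries for highly active men (cross r = .60, SD = 7.7), and the second pair uses made-up standard deviations purely to show that equal validity does not imply equal precision.

```python
import math


def standard_error_of_estimate(r, sd):
    """SEE = sqrt(1 - r^2) * SD, in the units of the criterion."""
    return math.sqrt(1.0 - r ** 2) * sd


# Table 6, highly active men: r = .60, SD = 7.7 ml/kg/min.
print(round(standard_error_of_estimate(0.60, 7.7), 2))

# Same validity, different criterion spread (illustrative SDs only).
print(round(standard_error_of_estimate(0.80, 10.0), 2))
print(round(standard_error_of_estimate(0.80, 5.0), 2))
```

Halving the standard deviation halves the SEE even though r is unchanged, which is why validity alone is an incomplete guide to precision.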

The  fact  that  the  SEE  formula  includes  SD  renders  validity 
an  imperfect  guide  to  test  precision.  Tests  with  equal  validity 
coefficients  could  have  very  different  precision.  This  outcome 
would result if one test has been validated in samples with large
SDs for VO2max (e.g., the general adult population between 30 and
70 years of age) and the other in more homogeneous populations
(e.g., elite runners).

Table 7 provides SEE estimates for men and women on univariate
walk tests and multivariate walk tests.^14 Also, that


^13The utility of the combined set of predictors has been established.
However, some individual predictors may contribute little to the
predictive accuracy of the equations. If so, the equations could be
simplified by dropping those predictors. This issue is outside the
scope of this review.

^14The 6-min walk was not included. All studies of this test involved
mixed-gender samples of patients (Appendix A). The sexes presumably
were not separated because patient status was more important than
gender. The average SEE for the 6-min walk was 3.67.
Table 7. SEE Values for Different Tests

Test                    Males    Females

Univariate Walk
  1-mi                   6.21     5.89
  2-km                   6.19     4.05
  12-min                 5.58     N/A
  Average(a)             5.99     4.47

Multivariate Walk
  RFWT General           4.83     3.76
  RFWT Specific          5.41     4.01
  UKKWT                  4.78     3.56
  Average(a)             5.01     3.78

Run
  1-mi                   4.74     4.71
  2-km                   5.89     3.45
  12-min                 3.82     3.35
  1.5-mi(b)              4.30     3.90
  2-mi(b)                4.70     2.44
  3-mi(b)                4.68     2.44
  Average(a)             4.69     3.38

Note. "N/A" = not available. Table entries are in ml·kg⁻¹·min⁻¹.
RFWT = Rockport Fitness Walking Test; UKKWT = UKK Walk Test.

(a) Unweighted average.

(b) Distance used in PFT for one branch of military services in the U.S.
Department of Defense.


table gives SEEs for run tests covering walk test distances or
times. Separate values have been reported for men and women
because gender clearly affected test precision. The SEE for males
was larger than that for females in all 11 comparisons provided
in Table 7.

The  1.5-,  2-,  and  3-mile  runs  have  not  been  considered  in 
previous  sections  of  this  paper.  These  runs  were  added  to  Table  7 
because  they  are  elements  of  PFTs  in  different  service  branches 
within  the  U.S.  Department  of  Defense.  These  PFTs  probably 
represent  the  most  extensive  use  of  run  tests  to  evaluate  aerobic 
fitness  in  the  adult  population.  Combining  these  measures  with 
the  1-mile,  2-km,  and  12-min  run  tests  provides  a  more  extensive 


basis for comparing walk tests with alternative run tests. The
estimated values of SEE for these run tests were derived from
data covered in previous reviews of run test validity (Vickers,
2001a, 2001b).

Two general conclusions can be drawn from Table 7. First,
multivariate walk tests are more accurate than univariate walk
tests (average SEE lower by 0.98 ml·kg⁻¹·min⁻¹ for men and 0.69
ml·kg⁻¹·min⁻¹ for women). Second, run tests are slightly more
accurate than multivariate walk tests (average SEE lower by 0.32
ml·kg⁻¹·min⁻¹ for men and 0.40 ml·kg⁻¹·min⁻¹ for women).
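
The differences quoted above are simple subtractions of the average SEE rows in Table 7. The sketch below makes that arithmetic explicit; the dictionary layout and function name are hypothetical conveniences, and only the three "Average" rows from the table are used.

```python
# Average SEE rows from Table 7 (ml/kg/min).
AVERAGE_SEE = {
    "univariate walk": {"men": 5.99, "women": 4.47},
    "multivariate walk": {"men": 5.01, "women": 3.78},
    "run": {"men": 4.69, "women": 3.38},
}


def see_difference(less_precise, more_precise, group):
    """Reduction in average SEE when moving to the more precise test type."""
    diff = AVERAGE_SEE[less_precise][group] - AVERAGE_SEE[more_precise][group]
    return round(diff, 2)


for group in ("men", "women"):
    print(group,
          see_difference("univariate walk", "multivariate walk", group),
          see_difference("multivariate walk", "run", group))
```

The gain from moving to a multivariate walk test is roughly two to three times the further gain from moving to a run test, for both men and women.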

The SEEs for the RFWT equations provided another point of
interest. The gender-specific SEE was larger than the generalized
SEE. The inequality held for both men and women. This finding is
further support for the prior suggestion that the Generalized
Equation for the RFWT is preferable to the Gender-Specific
Equations for that test (p. 11). The previous recommendation was
based on a preference for simplicity. The evidence in Table 7
indicates that the preference for simplicity does not entail a
loss of accuracy.

The SEE values can be used to choose a field test to
estimate VO2max. Run tests are preferable to multivariate walk
tests.  Multivariate  walk  tests,  in  turn,  are  preferable  to 
univariate  walk  tests.  This  ordering  applies  if  all  other  things 
are  equal.  However,  a  multivariate  walk  test  might  be  chosen  over 
a  run  test  if  the  test  will  be  administered  to  a  population  of 
older  individuals  who  might  be  at  increased  risk  of  injury  during 
the  test.  A  univariate  walk  test  might  be  preferred  to  a 
multivariate  walk  test  to  avoid  the  requirements  for  collecting 
and  analyzing  additional  data  (i.e.,  age,  weight,  exercise  HR). 
The  SEE  estimates  can  be  used  to  weigh  the  gains  in  terms  of 
reduced  risk  and  ease  of  administration  against  the  loss  of 
precision in the VO2max estimates.

The  SEE  computations  also  clearly  indicate  that  choices 
between  tests  should  not  be  based  solely  on  validity 
coefficients.  Using  the  average  validity  coefficient  as  the 
criterion  of  choice,  multivariate  walk  tests  would  rank  ahead  of 
run  tests.  Vickers  (2001a,  2001b)  estimated  the  upper  limit  of 
validity  for  run  tests  at  r  =  .82.  The  cross-validation 
coefficients  for  multivariate  walk  tests  exceeded  this  upper 
limit (cf. Table 4, p. 8). However, the multivariate walk test
coefficients were derived in samples with greater variation among
subjects than was typical in the studies of run tests. The net
result was that multivariate equations explained a larger
proportion of the variance in VO2max but still left more residual
error variance.
Other Issues

Walk  tests  are  valid  and  competitive  with  other  field 
measures  of  aerobic  fitness.  With  these  points  established,  this 
section  briefly  considers  some  other  issues  that  might  affect  the 
decision  to  use  a  walk  test. 

Safety  concerns  can  make  walk  tests  an  attractive  option. 
These  tests  merit  special  attention  when  test  population  members 
are  at  risk  for  musculoskeletal  injury,  heart  attacks,  and  other 
adverse  health  consequences  from  heavier  exertion.  Properly 
supervised  walk  tests  are  safe  even  in  highly  vulnerable 
populations.  Walk  tests  have  been  used  extensively  in  severely 
ill  patient  populations,  primarily  those  suffering  from  cardiac 
disease  and  chronic  lung  disease.  No  significant  problems  with 
the  walk  tests  have  been  reported  in  the  literature  on  patient 
populations. Several authors have explicitly mentioned this issue
and noted that either no problems or only minor problems arose
during testing (Cahalin, Mathier, et al., 1996; Cahalin,
Pappagianoulous, et al., 1995; Langenfeld, Mathier, et al., 1990;
Nixon, Joswiak, et al., 1996; Riley, McParland, et al., 1992;
Roul, Germain, et al., 1998). If the test is safe in these
populations,  the  risk  of  adverse  effects  in  a  healthy,  generally 
active  population  between  40  and  60  years  of  age  must  be  minimal. 

Practice  effects  are  a  concern.  People  should  practice  the 
walk  tests  to  ensure  that  their  performance  reflects  the  best 
they  can  do.  Several  studies  have  shown  that  performance  improves 
when  a  walk  test  is  repeated  once  or  twice.  A  single  practice 
trial  apparently  is  enough  to  stabilize  performance  in  healthy 
normal adults (Jackson & Solomon, 1994).

The  fitness  of  the  population  being  tested  may  be  a 
concern.  Walk  tests  may  not  provide  sufficient  challenge  to 
permit  fit,  active  individuals  to  utilize  their  full  aerobic 
capacity (e.g., Widrick, Ward, et al., 1992). If so, walk tests
will systematically underestimate aerobic capacity in such
individuals.

Conclusions 

Walk  tests  are  valid  indicators  of  aerobic  capacity.  Simple 
walk  tests  (i.e.,  time  to  cover  1  mile  or  2  km  or  distance 
covered in 12 min) satisfy minimum standards for estimating
VO2max. Multivariate walk test equations that add age, weight,
gender, and exercise HR to walk time provide more accurate
estimates. The precision of VO2max estimates provided by the
multivariate  equations  is  very  close  to  that  of  endurance  runs, 
including  the  runs  currently  used  in  PFTs. 
Table A-1. Fixed-Distance Walk Tests

and standard deviation for the relevant variable. Under VO2max, "Resid" = sample mean − predicted mean based
on age using equations of Fitzgerald, Tanaka, et al. (1997) and Wilson & Tanaka (2000). "Obs. r" is the
observed correlation; "Adj. r" is the correlation adjusted for restriction of range. The z-value is the
test of the null hypothesis that r = 0 using the Fisher r-to-z transformation (p < .05 if z > 1.95). "SEE"
is the standard error of estimate (see p. 16).
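
The z-test named in this note can be sketched as follows. The note does not spell out the computation, so the code assumes the usual form, in which the Fisher-transformed correlation is divided by its standard error, 1/√(n − 3); the function name and the example inputs are hypothetical.

```python
import math


def fisher_z_statistic(r, n):
    """z for H0: rho = 0, using atanh(r) with standard error 1/sqrt(n - 3)."""
    return math.atanh(r) * math.sqrt(n - 3)


# Illustrative inputs only: r = .60 observed in a sample of 44.
z = fisher_z_statistic(0.60, 44)
print(round(z, 2), "significant at p < .05" if abs(z) > 1.96 else "not significant")
```

Because the standard error depends only on n, the same observed r yields a larger z in a larger sample.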


Table  A-2.  Fixed-Time  Walk  Tests 
[Table entries for Tables A-1 and A-2 are not recoverable from the source.]

Appendix  B 

Evaluation  of  Dolgener,  Hensley,  et  al.  (1994)  Cross-validation 

of  RFWT 

Additional analyses were carried out to better understand
the poor predictive accuracy of the RFWT equations in the
Dolgener et al. (1994) data. Was this result the product of
special characteristics of the sample, or was it a failure of the
equations? Two lines of evidence were available to answer this
question. First, Dolgener et al. (1994) developed sample-specific
equations using the same predictors as the RFWT. If the RFWT
equations had been at fault, these sample-specific equations
would have been much more accurate predictors of VO2max than were
the RFWT equations. This expectation was not met. The equation for
women provided no increase in accuracy at all compared to the
corresponding RFWT equation (r = .41 for each). The equation for
men did improve on the RFWT (r = .51 vs. r = .432). The
improvement was statistically significant (F(4, 92) = 2.69, p < .036)
only if this comparison was considered in isolation from the
results for females.

The results of individual significance tests must be viewed
with caution when multiple tests are performed (Dunn, 1961). When
multiple tests are performed, the significance criterion for
individual tests should be set higher than if a single test were
performed. The increased stringency for individual tests allows
for the probability that at least one test will be significant by
chance alone (Dunn, 1961). Because two significance tests were
performed, an adjusted significance criterion of p < .025 (i.e.,
.05/2) was appropriate for the present case. The improvement for
males was not statistically significant by this criterion. Thus,
the RFWT equations were just as good as the best sample-specific
equations. Classifying the equations as having "failed" to
cross-validate when they were nearly as accurate as the best
possible predictive equation for the sample was not reasonable.
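
The adjustment described above can be illustrated with a short calculation. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the reported bound for men (p < .036) is treated as the p-value itself for the purpose of the comparison.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance criterion when n_tests tests share one family-wise alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def is_significant(p_value: float, family_alpha: float, n_tests: int) -> bool:
    """True if p_value falls below the adjusted per-test criterion."""
    return p_value < bonferroni_alpha(family_alpha, n_tests)


# Assumption: p = .036 stands in for the reported "p < .036" for men.
p_men = 0.036

print(bonferroni_alpha(0.05, 2))         # adjusted criterion for two tests
print(is_significant(p_men, 0.05, 1))    # comparison considered in isolation
print(is_significant(p_men, 0.05, 2))    # comparison adjusted for two tests
```

Under these assumptions the comparison for men meets the unadjusted .05 criterion but not the adjusted .025 criterion, which is the pattern described in the text.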

A cross-validation of the Dolgener equations in a sample of
females (Fontenot, 2001) provided the second line of evidence
indicating that these equations represented an outlier data set.
The cross-validation produced a low validity coefficient for the
predicted VO2max using either the Generalized Equation (r = .499)
or the female-specific equation (r = .458). Fontenot's (2001)
analyses also indicated that the Dolgener et al. (1994) equations
had substantial predictive bias. The equations consistently
underestimated observed VO2max values.

The combination of weak predictive accuracy for
sample-specific equations in the original Dolgener et al. (1994)
data with poor cross-validation of the Dolgener equations in a new
sample suggests that the Dolgener et al. (1994) sample was
atypical. If so, the most important finding in these analyses was
that the RFWT equations fit the Dolgener et al. (1994) data about
as well as possible. An atypical sample would also explain why the
study produced outlier values. The reasons for this atypical
character are not obvious, but the points considered here were
sufficient to justify dropping the study as an outlier (Barnett &
Lewis, 1978).
13. SUPPLEMENTARY NOTES

14.  ABSTRACT  (maximum  200  words) 

Military physical fitness tests (PFTs) use distance runs to assess aerobic fitness. Walk tests are alternatives to this
practice. This meta-analysis summarized 39 studies (1,927 participants) relating walk test performance to laboratory
measures of maximal oxygen uptake (VO2max). The laboratory measures are the accepted gold standard for assessing
aerobic fitness. For adults, the average walk test performance-VO2max correlation was r = .56 for a 6-min walk, r = .74 for a
12-min walk, r = .^ for a 1-km walk, r = .64 for a 1-mile walk, and r = .64 for a 2-km walk. Each average value was highly
significant (p < 10"®). All of the averages were lower than would be obtained with run tests (r > .74), so the review was
extended to consider multivariate equations combining walk test performance with age, weight, gender, and exercise heart
rate to predict VO2max. These equations predict VO2max accurately and cross-validate well. The standard error of
estimate (SEE) for VO2max predictions from these equations was only 0.32 to 0.40 ml·kg⁻¹·min⁻¹ larger than the
equivalent statistic for run tests. Walk tests are valid and are comparable to run tests as indicators of VO2max when the
multivariate approach is used.